Towards Detailed Recognition of Visual Categories

Author

Subhransu Maji

Abstract

As humans, we have a remarkable ability to perceive the world around us in minute detail purely from the light that is reflected off it – we can estimate material and metric properties of objects, localize people in images, describe what they are doing, and even identify them. Automatic methods for such detailed recognition of images are essential for most human-centric applications and for large-scale analysis of the content of media collections for market research, advertisement, and social studies. For example, in order to shop for shoes in an online catalogue, a system should be able to understand the style of a shoe, the length of its heels, or the shininess of its material. In order to support visual demographics analysis for advertisement, a system should be able not only to identify the people in a scene, but also to understand what kind (style and brand) of clothes they are wearing, whether they are wearing any accessories, and so on. Despite several successes, such detailed recognition is beyond current computer vision systems. This is a challenging task, and to make progress we have to make advances on several fronts. We need better representations of visual categories that can enable fine-grained reasoning about their properties, as well as machine learning methods that can leverage ‘big-data’ to learn such representations. In order to enable benchmarks for evaluating recognition tasks and to guide learning and inference in models that solve challenging problems, we need to develop better ways of human-computer interaction. My research touches upon several such themes at the intersection of computer vision, machine learning, and human-computer interaction, including:

Similar resources

Using Eye Movement Analysis to Study Auditory Effects on Visual Memory Recall

Recent studies in affective computing focus on sensing human cognitive context using biosignals. In this study, electrooculography (EOG) was used to investigate memory recall accessibility via eye movement patterns. Twelve subjects participated in our experiment, in which pictures from four categories were presented. Each category contained nine pictures of which three were presented t...

A Computational Approach towards Visual Object Recognition at Taxonomic Levels of Concepts

It has been argued that concepts can be perceived at three main levels of abstraction. Generally, in a recognition system, object categories can be viewed at three levels of taxonomic hierarchy which are known as superordinate, basic, and subordinate levels. For instance, "horse" is a member of subordinate level which belongs to basic level of "animal" and superordinate level of "natural object...

Human Action Recognition and Shape Segmentation-Recognition

Human Action Recognition. Human action recognition has a broad range of applications such as video search, sports analysis, human-robot interaction, and health care. Our work is organized in two directions: 1) detailed pixel-level ‘motion and pose’, focusing on close interactions among people; 2) action recognition focusing on goal-oriented motion, simplified as ‘action = motion + intention’....

How Humans Describe Short Videos

Recognition, manipulation and representation of visual objects can be simplified significantly by “abstraction”. By definition, abstraction extracts essential features and properties while neglecting unnecessary details. We have conducted two sets of experiments in order to relate the abstraction levels used by humans when describing videos to the abstraction level categories used in computer vision. ...

Active Object Exploration in Toddlers and its Role in Visual Object Recognition

Adult active object exploration of novel objects is highly focused; particularly considering the time spent performing specific visual transformations of an object as found in previous experimental studies. The most stereotypical is rotating around object orientations where the object’s main axis is either elongated (e.g. a side) or foreshortened (e.g. the top or the bottom). These orientations...

Publication date: 2013
